Performance Measures for Multi-Graded Relevance
Authors
Abstract
We extend performance measures commonly used in semantic web applications to be capable of handling multi-graded relevance data. Most of today's recommender social web applications offer the possibility to rate objects with different levels of relevance. Nevertheless, most performance measures in Information Retrieval and recommender systems are based on the assumption that retrieved objects (e.g. entities or documents) are either relevant or irrelevant. Hence, thresholds have to be applied to convert multi-graded relevance labels to binary relevance labels. With regard to the necessity of evaluating information retrieval strategies on multi-graded data, we propose an extended version of the performance measure average precision that pays attention to levels of relevance without applying thresholds, but keeping and respecting the detailed relevance information. Furthermore, we propose an improvement to the NDCG measure avoiding problems caused by different scales in different datasets.
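To make the thresholding problem described in the abstract concrete, here is a minimal Python sketch of the conventional baselines only: binary average precision computed after thresholding graded labels, and a textbook NDCG with a linear gain and a log2 discount. It is not the extended measure proposed in the paper. The function names, the gain and discount choices, and the assumption that every relevant item appears in the ranked list are illustrative assumptions, not taken from the source.

```python
import math


def binarize(grades, threshold):
    """Convert graded labels to binary labels: 1 if grade >= threshold, else 0."""
    return [1 if g >= threshold else 0 for g in grades]


def average_precision(binary_rels):
    """Binary average precision over a ranked list.

    Assumption: all relevant items occur in the list, so the score is
    normalized by the number of relevant items found in it.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(binary_rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def dcg(grades):
    """Discounted cumulative gain with linear gain and log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(grades):
    """DCG divided by the DCG of the ideal (descending) ordering of the same grades."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0
```

For a hypothetical ranked list with grades `[3, 0, 2, 1]` on a 0 to 3 scale, `average_precision(binarize(grades, 2))` and `average_precision(binarize(grades, 1))` give different scores for the same ranking, which illustrates how the choice of threshold affects the evaluation, whereas `ndcg(grades)` uses the grades directly.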
Similar resources
Cumulated Gain-based Indicators of IR Performance
Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, it is necessary to develop evaluation approaches and methods that credit IR methods for their abi...
Nugget-Based Computation of Graded Relevance
We propose a simple method for assigning graded relevance values to documents judged during the course of a retrieval experiment. In making this proposal, we aim to avoid the potential for ambiguity and greater cognitive load associated with standard graded relevance judgments. Under our proposal, we first decompose a retrieval topic into a number of informational nuggets. For each document, a ...
Survey of graded relevance metrics for information retrieval
A large number of metrics are available to evaluate the quality of ranked web pages in information retrieval (IR). These metrics can be classified into groups as follows: Binary Relevance, Graded Relevance, Rank Correlation Coefficient, and User Oriented Measures. Each group of metrics has different characteristics. However, metrics contained in the same group have similar charac...
A Multi-objective simulated annealing algorithm to solving flexible no-wait flowshop scheduling problems with transportation times
This paper deals with a bi-objective hybrid no-wait flowshop scheduling problem minimizing the makespan and total weighted tardiness, in which we consider transportation times between stages. Obtaining an optimal solution for this type of complex, large-sized problem in reasonable computational time by using traditional approaches and optimization tools is extremely difficult. This paper presen...
Predicting Relevance based on Assessor Disagreement
We present the Predicted Relevance Model (PRM): it allows moving from binary evaluation measures that reflect a single assessor’s judgments, towards graded measures that represent the relevance towards random users.